
    Layer-wise compressive training for convolutional neural networks

    Convolutional Neural Networks (CNNs) are brain-inspired computational models designed to recognize patterns. Recent advances demonstrate that CNNs are able to achieve, and often exceed, human capabilities in many application domains. Comprising several million parameters, even the simplest CNN has a large model size. This characteristic is a serious concern for deployment on resource-constrained embedded systems, where compression stages are needed to meet stringent hardware constraints. In this paper, we introduce a novel accuracy-driven compressive training algorithm. It consists of a two-stage flow: first, layers are sorted by means of heuristic rules according to their significance; second, a modified stochastic gradient descent optimization is applied to the less significant layers so that their representation is collapsed into a constrained subspace. Experimental results demonstrate that our approach achieves remarkable compression rates with low accuracy loss (<1%).
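
    The sketch below, in PyTorch-style Python, illustrates one way such a two-stage flow could look. The significance heuristic (mean absolute weight) and the constrained subspace (collapsing weights onto k representative values) are illustrative assumptions, not the paper's exact formulation.

        import torch

        def layer_significance(layer):
            # Hypothetical heuristic: mean absolute weight as a significance proxy.
            return layer.weight.abs().mean().item()

        def project_to_subspace(weight, k=8):
            # Assumed constraint: collapse the layer onto k representative values.
            centroids = torch.quantile(weight.flatten(), torch.linspace(0, 1, k))
            idx = (weight.unsqueeze(-1) - centroids).abs().argmin(dim=-1)
            return centroids[idx]

        def compressive_step(model, loss, lr=1e-2, threshold=0.05):
            # Stage 1 happens offline: layers scoring below `threshold` are
            # treated as less significant. Stage 2: SGD step plus projection.
            loss.backward()
            with torch.no_grad():
                for m in model.modules():
                    w = getattr(m, "weight", None)
                    if w is None or w.grad is None:
                        continue
                    w -= lr * w.grad
                    if layer_significance(m) < threshold:
                        w.copy_(project_to_subspace(w))
                    w.grad = None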

    Modeling of thermally induced skew variations in clock distribution network

    The clock distribution network is sensitive to large thermal gradients on the die, as the performance of both clock buffers and interconnects is affected by temperature. A robust clock network design relies on accurate analysis of clock skew subject to temperature variations. In this work, we address the problem of thermally induced clock skew modeling in nanometer CMOS technologies. The complex thermal behavior of both buffers and interconnects is taken into account. In addition, our characterization of the temperature effect on buffers and interconnects provides valuable insight to designers about the potential impact of thermal variations on clock networks. The use of industry-standard data formats in the interface allows our tool to be easily integrated into existing design flows.
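
    As a toy illustration of how thermal gradients translate into skew, the sketch below sums temperature-dependent buffer and wire delays along two clock branches. The first-order linear models and all coefficients are illustrative assumptions, not characterized values from the paper.

        T_REF = 25.0  # reference temperature, degrees C

        def buffer_delay(t_c, d0=20e-12, k_buf=0.002):
            # Buffer delay grows with temperature (assumed linear model).
            return d0 * (1 + k_buf * (t_c - T_REF))

        def wire_delay(t_c, r0=100.0, c=5e-15, k_r=0.004):
            # Elmore-style RC delay; metal resistance rises with temperature.
            return 0.69 * r0 * (1 + k_r * (t_c - T_REF)) * c

        def branch_delay(temps_along_branch):
            # Sum stage delays along one branch, one (buffer, wire) per node.
            return sum(buffer_delay(t) + wire_delay(t) for t in temps_along_branch)

        # Skew between two sinks whose branches see different thermal profiles.
        skew = branch_delay([80.0, 75.0, 70.0]) - branch_delay([40.0, 38.0, 35.0])
        print(f"thermally induced skew: {skew * 1e12:.2f} ps")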

    Enabling DVFS Side-Channel Attacks for Neural Network Fingerprinting in Edge Inference Services

    The Inference-as-a-Service (IaaS) delivery model provides users access to pre-trained deep neural networks while safeguarding the network code and weights. However, IaaS is not immune to security threats, such as side-channel attacks (SCAs), which exploit unintended information leakage from the physical characteristics of the target device. Exposure to such threats grows when IaaS is deployed on distributed computing nodes at the edge. This work identifies a potential vulnerability of low-power CPUs that facilitates stealing the deep neural network architecture without physical access to the hardware or interference with the execution flow. Our approach relies on a Dynamic Voltage and Frequency Scaling (DVFS) side-channel attack, which monitors the CPU frequency state during the inference stages. Specifically, we introduce a dedicated load-testing methodology that imprints distinguishable signatures of the network on the frequency traces. A machine learning classifier is then used to infer the victim architecture. Experimental results on two commercial ARM Cortex-A CPUs, the A72 and A57, demonstrate the attack can identify the target architecture from a pool of 12 convolutional neural networks with an average accuracy of 98.7% and 92.4%, respectively.
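
    A conceptual sketch of such an attack pipeline is shown below: poll the frequency reported by the Linux cpufreq interface while the victim inference runs, then classify the trace. The polling loop, histogram features, and classifier choice are assumptions for illustration, not the paper's exact methodology.

        import time
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        FREQ_PATH = "/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq"

        def sample_freq_trace(duration_s=2.0, period_s=0.001):
            # Poll the DVFS governor's current frequency (readable without
            # special privileges on many Linux systems), one trace per run.
            trace = []
            end = time.time() + duration_s
            while time.time() < end:
                with open(FREQ_PATH) as f:
                    trace.append(int(f.read()))
                time.sleep(period_s)
            return np.array(trace)

        def featurize(trace, bins=32):
            # Hypothetical features: histogram of visited frequency states.
            hist, _ = np.histogram(trace, bins=bins, density=True)
            return hist

        # X: featurized traces collected for known architectures; y: labels.
        # clf = RandomForestClassifier().fit(X, y)
        # predicted_arch = clf.predict([featurize(sample_freq_trace())])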

    TVFS: Topology Voltage Frequency Scaling for Reliable Embedded ConvNets

    This brief introduces Topology Voltage Frequency Scaling (TVFS), a performance management technique for embedded Convolutional Neural Networks (ConvNets) deployed on low-power CPUs. Using TVFS, pre-trained ConvNets can be efficiently processed over a continuous stream of data, enabling reliable and predictable multi-inference tasks under latency constraints. Experimental results, collected from an image classification task built with MobileNet-v1 and ported onto an ARM Cortex-A15 core, reveal that TVFS sustains fast and continuous inference (from a few runs up to 2000) with limited accuracy loss (from 0.9% to 3.1%) and better thermal profiles (average temperature 16.4 °C below the on-chip critical threshold).
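
    As a rough illustration of latency-constrained operating-point selection in the spirit of TVFS, the sketch below picks, among (topology width, frequency) points that meet a deadline, the most accurate one. The table values and the selection policy are illustrative assumptions, not the paper's measured data.

        OPERATING_POINTS = [
            # (width multiplier, CPU freq in MHz, latency in ms, top-1 accuracy %)
            (1.00, 2000, 33.0, 70.6),
            (1.00, 1200, 52.0, 70.6),
            (0.75, 2000, 21.0, 68.4),
            (0.50, 2000, 12.0, 63.7),
        ]

        def select_point(deadline_ms):
            # Keep every point that meets the deadline, then maximize accuracy.
            feasible = [p for p in OPERATING_POINTS if p[2] <= deadline_ms]
            if not feasible:
                return min(OPERATING_POINTS, key=lambda p: p[2])  # fastest fallback
            return max(feasible, key=lambda p: p[3])

        print(select_point(40.0))  # -> (1.0, 2000, 33.0, 70.6)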

    Graphene-PLA (GPLA): A compact and ultra-low power logic array architecture

    The key characteristics of the next generation of ICs for wearable applications include high integration density, small area, low power consumption, high energy efficiency, reliability, and enhanced mechanical properties such as stretchability and transparency. The proper mix of new materials and novel integration strategies is the enabling factor in achieving those design specifications. Moving toward this goal, we introduce a graphene-based regular logic-array structure for energy-efficient digital computing. It consists of graphene p-n junctions arranged into a regular mesh. The obtained structure resembles that of Programmable Logic Arrays (PLAs), hence the name Graphene-PLAs (GPLAs); the high expressive power of graphene p-n junctions and their resistive nature enable the implementation of ultra-low power adiabatic logic circuits.
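
    For readers unfamiliar with the PLA model that GPLAs resemble, the sketch below evaluates a two-plane (AND/OR) logic array in plain Python. It illustrates the regular-array computation only and does not model the electrical behavior of graphene p-n junctions.

        def pla_eval(inputs, and_plane, or_plane):
            # and_plane: per product term, a dict {input_index: required_value}.
            terms = [all(inputs[i] == v for i, v in term.items()) for term in and_plane]
            # or_plane: per output, the list of product-term indices it ORs.
            return [any(terms[t] for t in out) for out in or_plane]

        # Example: f0 = (a AND NOT b) OR (b AND c) over inputs (a, b, c).
        and_plane = [{0: 1, 1: 0}, {1: 1, 2: 1}]
        or_plane = [[0, 1]]
        print(pla_eval([1, 0, 0], and_plane, or_plane))  # [True]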

    Dynamic ConvNets on Tiny Devices via Nested Sparsity

    This work introduces a new training and compression pipeline to build nested sparse convolutional neural networks (ConvNets), a class of dynamic ConvNets suited for inference tasks deployed on resource-constrained devices at the edge of the Internet of Things. A nested sparse ConvNet consists of a single ConvNet architecture containing N sparse subnetworks with nested weight subsets, like a Matryoshka doll, and can trade accuracy for latency at runtime, using the model sparsity as a dynamic knob. To attain high accuracy at training time, we propose a gradient masking technique that optimally routes the learning signals across the nested weight subsets. To minimize the storage footprint and efficiently process the obtained models at inference time, we introduce a new sparse matrix compression format with dedicated compute kernels that exploit the characteristics of the nested weight subsets. Tested on image classification and object detection tasks on an off-the-shelf ARM Cortex-M7 microcontroller unit (MCU), nested sparse ConvNets outperform variable-latency solutions naively built by assembling single sparse models trained as stand-alone instances, achieving 1) comparable accuracy, 2) remarkable storage savings, and 3) high performance. Moreover, when compared to state-of-the-art dynamic strategies, such as dynamic pruning and layer width scaling, nested sparse ConvNets turn out to be Pareto-optimal in the accuracy versus latency space.
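
    The sketch below illustrates the nesting property with magnitude-based masks: the weights surviving at a higher sparsity level are a subset of those surviving at every lower level (the "Matryoshka" property). The masking rule is an illustrative stand-in, not the paper's gradient masking technique or compression format.

        import torch

        def nested_masks(weight, sparsities=(0.9, 0.7, 0.5)):
            # Thresholding the same magnitude ranking at decreasing sparsity
            # levels guarantees the nesting m_0.9 is a subset of m_0.7, etc.
            flat = weight.abs().flatten()
            masks = []
            for s in sparsities:
                k = int((1 - s) * flat.numel())  # number of surviving weights
                thresh = flat.topk(k).values.min()
                masks.append((weight.abs() >= thresh).float())
            return masks

        w = torch.randn(64, 64)
        m90, m70, m50 = nested_masks(w)
        assert torch.all(m90 <= m70) and torch.all(m70 <= m50)  # nested subsets

        # At runtime, sparsity is the dynamic knob: pick a mask per latency budget.
        fast_weights, accurate_weights = w * m90, w * m50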